
[Config Refactor] HunyuanImage3 pipeline configs#2989

Closed
lishunyang12 wants to merge 21 commits into vllm-project:main from lishunyang12:config-refactor-4-hunyuan-image3

Conversation

@lishunyang12
Collaborator

@lishunyang12 lishunyang12 commented Apr 21, 2026

Summary

Continuation of RFC #2072. Migrates HunyuanImage-3.0 from the legacy vllm_omni/model_executor/stage_configs/hunyuan_image3_*.yaml files (7 yamls + 2 platform overlays) into the new pipeline.py (topology) + vllm_omni/deploy/<model>.yaml (deployment) split established by #2383.

Variant strategy

Five separate model_types — one per task — registered in _OMNI_PIPELINES. Precedent: qwen2_5_omni + qwen2_5_omni_thinker_only.

| model_type | Topology | Default deploy yaml |
| --- | --- | --- |
| hunyuan_image3_t2i | AR (stage 0) → DiT (stage 1) with KV transfer | deploy/hunyuan_image3_t2i.yaml (+ _fp8.yaml) |
| hunyuan_image3_it2i | AR (mm input, stage 0) → DiT (stage 1) | deploy/hunyuan_image3_it2i.yaml |
| hunyuan_image3_dit_only | DiT only (stage 0) | BYO (only the CI yaml ships) |
| hunyuan_image3_i2t | AR only (stage 0), image+text → text | BYO |
| hunyuan_image3_t2t | AR only (stage 0), text → text | BYO |

dit_only / i2t / t2t carry only the pipeline.py topology — no default deploy yaml — because hardware sizing for those modes depends on the use case. Users bringing their own deploy yaml just point --pipeline hunyuan_image3_<variant> at it.
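
To make the "shipped default vs. BYO" split concrete, here is a minimal sketch of how a registry keyed by model_type could resolve a deploy yaml. `PipelineEntry`, `PIPELINES`, and `resolve_deploy` are hypothetical names for illustration only; the real `_OMNI_PIPELINES` structure and lookup logic may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PipelineEntry:
    """Hypothetical stand-in for a registered pipeline topology."""
    stages: tuple            # stage order, e.g. ("AR", "DiT")
    default_deploy: Optional[str]  # None means bring-your-own deploy yaml


# Illustrative registry; mirrors the variant table in the PR description.
PIPELINES = {
    "hunyuan_image3_t2i": PipelineEntry(("AR", "DiT"), "deploy/hunyuan_image3_t2i.yaml"),
    "hunyuan_image3_it2i": PipelineEntry(("AR", "DiT"), "deploy/hunyuan_image3_it2i.yaml"),
    "hunyuan_image3_dit_only": PipelineEntry(("DiT",), None),
    "hunyuan_image3_i2t": PipelineEntry(("AR",), None),
    "hunyuan_image3_t2t": PipelineEntry(("AR",), None),
}


def resolve_deploy(model_type: str, user_deploy: Optional[str] = None) -> str:
    """Pick the deploy yaml: an explicit user file wins, else the shipped default."""
    if model_type not in PIPELINES:
        raise KeyError(f"unknown pipeline: {model_type}")
    deploy = user_deploy or PIPELINES[model_type].default_deploy
    if deploy is None:
        raise ValueError(f"{model_type} ships no default deploy yaml; pass your own")
    return deploy
```

Under this sketch, a BYO variant only works when the topology entry itself is registered, which is why dropping the deploy yaml and dropping the pipeline definition are different decisions.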

T2I path choice

T2I is registered as AR→DiT (matching the official bot_task="text" flow that produces a textual prompt for the DiT). For users wanting Tencent's bot_task="image" semantics (skip the AR side entirely), use --pipeline hunyuan_image3_dit_only with their own deploy yaml. The two paths produce different image quality / latency trade-offs; AR→DiT is the default because it matches the headline modality demonstrated in the official repo.

Deploy yaml consolidation

  • Hardware-tier deltas (1×H20 vs 4×H20 vs 2×L40S etc.) collapse into platforms: sections inside one deploy/hunyuan_image3_<variant>.yaml per task.
  • NPU + XPU overlays moved into platforms.npu / platforms.xpu sections of the corresponding CUDA yaml — mirrors qwen3_omni_moe.yaml structure.
  • FP8 stays as a separate deploy/hunyuan_image3_t2i_fp8.yaml (quantization is not a platform delta).

Field ownership

Following the 2/N decisions:

  • Pipeline (topology): model_arch=HunyuanImage3ForCausalMM, execution_type, input_sources, final_output*, omni_kv_config (KV transfer between AR↔DiT), kv_transfer_criteria, custom_process_input_func (hunyuan_image3.ar2diffusion on DiT stages), AR stop_token_ids: [127957] as sampling_constraints (model-intrinsic until [Follow-up] Deploy/pipeline config follow-ups from #2383 #2887 item 2 lands).
  • Deploy: gpu_memory_utilization, devices, tensor_parallel_size, max_num_seqs, default_sampling_params (per-variant AR sampling differs: t2i=greedy, it2i=temp=0.6/top_p=0.95/top_k=1024), DiT num_inference_steps=50, guidance_scale=2.5, hf_overrides.rope_parameters.mrope_section=[0,32,32]. AR stages keep enforce_eager: true per qwen3_omni_moe convention; DiT stages omit the field so cudagraph runs by default.
  • worker_cls / scheduler_cls are auto-derived from StageExecutionType.LLM_AR via _resolve_execution_mode — not copied.

Cleanup

Deletes:

  • vllm_omni/model_executor/stage_configs/hunyuan_image3_{t2i,t2i_2gpu,moe,moe_dit_2gpu_fp8,it2i,i2t,t2t}.yaml
  • vllm_omni/platforms/{npu,xpu}/stage_configs/hunyuan_image3_t2i.yaml

Updates:

  • Examples and tests that reference the old yaml paths now point to vllm_omni/deploy/.
  • tests/e2e/offline_inference/stage_configs/hunyuan_image3_dit_only_ci.yaml → tests/e2e/offline_inference/deploy/hunyuan_image3_dit_only_ci.yaml (renamed for consistency with the new schema).
  • examples/offline_inference/hunyuan_image3/end2end.py switched to Omni.from_cli_args(args, parser=parser, **overrides) so argparse defaults don't silently clobber deploy YAML values (override-precedence revisited in [RFC] Sentinel-default precedence for stage engine args #3035 post-0.20.0).
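
The precedence problem behind the last bullet can be illustrated with plain argparse. This is a generic sketch of one way to tell explicit flags from defaults, not the internals of `Omni.from_cli_args`; the flag names and the helper `merge_cli_over_deploy` are assumptions, while the two values come from the deploy defaults listed above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # default=SUPPRESS keeps a flag out of the namespace unless the user passed it,
    # so an argparse default can never masquerade as an explicit override.
    parser.add_argument("--num-inference-steps", type=int, default=argparse.SUPPRESS)
    parser.add_argument("--guidance-scale", type=float, default=argparse.SUPPRESS)
    return parser


def merge_cli_over_deploy(deploy_values: dict, argv: list) -> dict:
    """Layer only the explicitly passed CLI flags on top of deploy yaml values."""
    explicit = vars(build_parser().parse_args(argv))
    return {**deploy_values, **explicit}


# Values a deploy yaml might carry for the DiT stage.
deploy = {"num_inference_steps": 50, "guidance_scale": 2.5}

untouched = merge_cli_over_deploy(deploy, [])
overridden = merge_cli_over_deploy(deploy, ["--guidance-scale", "4.0"])
```

With ordinary argparse defaults, `untouched` would instead carry whatever defaults the parser declared, silently replacing the yaml values.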

Coordination

Independent of #2977 (HunyuanImage3 has a real config.json at the repo root, so model-type detection works without diffusers_class_name).

Test plan

  • pre-commit run --files <changed-files> passes
  • pytest tests/config/test_pipeline_registry.py -v
  • CI green
  • Manual e2e: t2i, it2i, dit_only on H20

cc @alex-jw-brooks @hsliuustc0106 @nussejzz @TaffyOfficial @xuechendi @xiaohajiayou

@lishunyang12 lishunyang12 changed the title [Config Refactor 4/N] HunyuanImage3 pipeline configs [Config Refactor] HunyuanImage3 pipeline configs Apr 21, 2026
@lishunyang12
Collaborator Author

No GPU to test. Awaiting

@lishunyang12 lishunyang12 marked this pull request as ready for review April 21, 2026 13:53
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

@lishunyang12
Collaborator Author

@alex-jw-brooks @xiaohajiayou PTAL

@hsliuustc0106
Collaborator

cc @kechengliu97

@hsliuustc0106
Collaborator

cc @Semmer2

Collaborator

@hsliuustc0106 hsliuustc0106 left a comment

Blocker scan

Category Result
Correctness BLOCK
Reliability/Safety PASS
Breaking Changes BLOCK
Test Coverage PASS
Documentation BLOCK
stage_config.py wiring PASS

Blocking issues

1. i2t and t2t modes deleted without migration

PR description says "Five separate model_types" and explicitly lists:

hunyuan_image3_i2t — AR only (stage 0) — Replaces i2t.yaml
hunyuan_image3_t2t — AR only (stage 0) — Replaces t2t.yaml

But neither is registered in pipeline_registry.py, no pipeline definition exists in pipeline.py, and both yamls are simply deleted. The README and end2end.py also remove those modalities entirely.

This is a breaking change for existing users. Either:

  • Add hunyuan_image3_i2t and hunyuan_image3_t2t pipeline definitions + deploy yamls, OR
  • Update the PR description to explicitly state these modes are intentionally dropped (not "Replaced")

2. FP8 deploy yaml mentioned but not included

PR body says:

FP8 stays as a separate deploy/hunyuan_image3_t2i_fp8.yaml (quantization is not a platform delta).

No such file exists in this diff. Both moe_dit_2gpu_fp8.yaml (2x H200 FP8 DiT) and t2i_2gpu.yaml (2-GPU AR) are deleted without replacement. Users on 2-GPU or FP8 setups lose their configs.

3. PR description / implementation mismatch

The description states 5 model_types; only 3 are implemented. The description lists NPU/XPU overlay consolidation, but only the t2i deploy yaml gets platform sections — it2i gets none (was this intentional?).

Non-blocking notes

  • stage_config.py changes look correct — omni_kv_config and requires_multimodal_data are new explicit wire-ups from StagePipelineConfig fields to engine_args/runtime, needed for the pipeline.py framework. No regression risk for existing yaml-based models.
  • hf_architectures (renders as *** in some tools) correctly used on T2I only for model-type fallback.
  • Pipeline topology definitions are clean and well-documented.
  • dit_only_ci.yaml correctly uses the new schema for the e2e test.
  • XPU/NPU consolidation into platforms: sections is the right pattern.

Contributor

@alex-jw-brooks alex-jw-brooks left a comment

Thanks, I think it looks good. Some thoughts below; I think the text2text/img2text points in the earlier review are also important.

enable_expert_parallel: false
vae_use_slicing: false
vae_use_tiling: false
cache_backend: null
Contributor

I'm actually not sure this is the right default value for cache_backend; it might currently be "none" as a string (e.g., based on places like this)

Since this and some of the others are default values though, I think it would be best to remove them where possible, since it makes the configs noisier

Collaborator Author

Good catch — yeah, cache_backend: null / cache_config: null / enable_cache_dit_summary: false were all defaults. Removed in df65e00. Same for the matching it2i values.

devices: "4,5,6,7"
parallel_config:
tensor_parallel_size: 4
enable_expert_parallel: false
Contributor

Why is expert parallel disabled in this config, but enabled in hunyuan_image3_it2i.yaml?

Collaborator Author

Copy-paste asymmetry — no real reason. Aligned t2i stage 1 to enable_expert_parallel: true to match it2i in df65e00.

- stage_id: 0
max_num_seqs: 1
gpu_memory_utilization: 0.95
enforce_eager: true
Contributor

Also curious about enforce_eager=True here

Collaborator Author

Stage 0 is the AR/MoE side — kept enforce_eager: true per the qwen3_omni_moe.yaml convention (cudagraph capture is unreliable across MoE expert routing during AR token-by-token generation). Flipped stage 1 (DiT) in df65e00 to fall through to the dataclass default False so cudagraph runs there.

devices: "0,1,2,3,4,5,6,7"
parallel_config:
tensor_parallel_size: 8
enable_expert_parallel: true
Contributor

It may be a good idea to add the NPU config here too, since there was one before. I only see an NPU section in the CI config

Collaborator Author

Done in df65e00 — ported the deleted platforms/npu/stage_configs/hunyuan_image3_t2i.yaml into a platforms.npu section under platforms.xpu.

Collaborator Author

Update on this — reversed in 8d0142e after @TaffyOfficial pointed out platform overlays are stage-wise patches, not full stage-list replacements. An NPU DiT-only override landing on the AR→DiT base would silently leave the AR stage running on NPU.

Split fix: dropped the platforms.npu block from hunyuan_image3_t2i.yaml (kept a pointer comment up top) and ported it to vllm_omni/deploy/hunyuan_image3_dit_only.yaml instead. Users on NPU run --pipeline hunyuan_image3_dit_only with the shipped dit_only deploy. Sorry for the back-and-forth.
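
The stage-wise patching behaviour that motivated the split can be sketched as follows. This is an illustration under stated assumptions, not the actual merge code: `apply_platform_overlay` is a hypothetical helper, and the stage dictionaries (including the AR device list) are made-up minimal examples.

```python
import copy


def apply_platform_overlay(base_stages: list, overlay_stages: list) -> list:
    """Patch base stages by stage_id; stages the overlay does not mention are kept."""
    merged = {stage["stage_id"]: copy.deepcopy(stage) for stage in base_stages}
    for patch in overlay_stages:
        stage_id = patch["stage_id"]
        # Shallow update: overlay keys win, all other keys of the base stage survive.
        merged.setdefault(stage_id, {"stage_id": stage_id}).update(patch)
    return [merged[stage_id] for stage_id in sorted(merged)]


# Two-stage AR -> DiT base, as in the t2i deploy (device lists are illustrative).
base = [
    {"stage_id": 0, "devices": "0,1,2,3", "tensor_parallel_size": 4},
    {"stage_id": 1, "devices": "4,5,6,7", "tensor_parallel_size": 4},
]
# An overlay written as if the deployment were single-stage: it only patches stage 0.
npu_overlay = [{"stage_id": 0, "devices": "0,1,2,3,4,5,6,7", "tensor_parallel_size": 8}]

result = apply_platform_overlay(base, npu_overlay)
```

Because the overlay cannot delete stage 1, the result still has two stages, which is the mismatch described above: a single-stage intent applied to a two-stage base.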

@kechengliu97
Contributor

Looks good. It is necessary to extract the common part rather than "one strategy, one yaml file".

@TaffyOfficial
Contributor

Re: description says "users can pass --pipeline hunyuan_image3_ with a custom deploy yaml" — but i2t/t2t have no pipeline definition in pipeline.py after this PR, only dit_only does. A user bringing their own deploy yaml would still hit registry lookup failure. If the intent is "BYO", please keep the pipeline.py topology entries for i2t/t2t (small addition, no yaml cost) and drop only the deploy yamls.

@TaffyOfficial
Contributor

tests/e2e/.../stage_configs/dit_only_ci.yaml reintroduces the stage_configs/ directory name that #2383 explicitly deprecated. Suggest renaming to tests/e2e/.../deploy/hunyuan_image3_dit_only_ci.yaml to stay consistent with the new schema — otherwise follow-up 2c's cleanup will miss it.

@TaffyOfficial
Contributor

One small design thought, feel free to ignore if this is already settled.

The official HunyuanImage-3 repo uses generate_image(prompt, bot_task="image") for T2I, which maps to the DIT_ONLY path rather than going through AR→DiT. I ran into this while setting up a GenEval CI on a fork and ended up registering DIT_ONLY as the default for hunyuan_image_3_moe so that vllm serve ... --omni would Just Work out of the box for T2I users.

The current PR keeps hunyuan_image3_t2i as AR→DiT and exposes hunyuan_image3_dit_only as a separate model_type, which is cleaner semantically but means users have to know to pass --pipeline hunyuan_image3_dit_only to match official behavior.

Not saying one is right and the other wrong; both have merit. Just thought it'd be worth a sentence in the PR description explaining the choice, so downstream users know which path matches the Tencent reference.

@lishunyang12 lishunyang12 added this to the v0.20.0 milestone Apr 22, 2026
Contributor

It may be necessary to use Omni.from_cli_args() here; otherwise, it won’t be possible to distinguish CLI arguments explicitly provided by the user.

Collaborator Author

Done in df65e00 — switched to Omni.from_cli_args(args, parser=parser, **overrides). Note: this distinction is being revisited under #3035 (sentinel-default precedence) post-0.20.0; for now matching today's convention.

Collaborator Author

It may be necessary to use Omni.from_cli_args() here; otherwise, it won’t be possible to distinguish CLI arguments explicitly provided by the user.

Hi @xiaohajiayou, I sent you an email asking you to add me on WeChat to facilitate further conversation.

@lishunyang12
Collaborator Author

Pushed df65e00 addressing the open review items:

Blockers (cc @hsliuustc0106)

  • Added HUNYUAN_IMAGE3_I2T_PIPELINE and HUNYUAN_IMAGE3_T2T_PIPELINE (AR-only topologies) to pipeline.py and registered them. No default deploy yaml — BYO per @TaffyOfficial's suggestion (hardware sizing for I2T/T2T depends on use case).
  • Added vllm_omni/deploy/hunyuan_image3_t2i_fp8.yaml (2x H200 FP8) — this was promised in the PR body but missing from the diff.
  • Updated PR description to match the actual implementation (5 model_types, FP8 yaml, NPU section, T2I path note).

Inline (cc @alex-jw-brooks)

  • Removed cache_backend / cache_config / enable_cache_dit_summary defaults from t2i.yaml.
  • Aligned t2i stage 1 enable_expert_parallel to true (matching it2i — the asymmetry was copy-paste).
  • Dropped enforce_eager: true from DiT stages in both t2i and it2i (falls through to dataclass default False for cudagraph). Kept on AR stages per qwen3_omni_moe convention.
  • Added platforms.npu section back to t2i.yaml, ported from the deleted NPU overlay.

TaffyOfficial follow-ups

  • Renamed tests/e2e/offline_inference/stage_configs/hunyuan_image3_dit_only_ci.yaml → tests/e2e/offline_inference/deploy/hunyuan_image3_dit_only_ci.yaml. Updated the test reference.
  • T2I path design note: kept AR→DiT as hunyuan_image3_t2i (matches the bot_task="text" flow). Users wanting Tencent's bot_task="image" (DIT-only) can pass --pipeline hunyuan_image3_dit_only. Documented in the updated PR description.

xiaohajiayou

@lishunyang12
Collaborator Author

@TaffyOfficial (re: i2t/t2t topology) — Done in df65e00. Added HUNYUAN_IMAGE3_I2T_PIPELINE and HUNYUAN_IMAGE3_T2T_PIPELINE to pipeline.py and registered both in _OMNI_PIPELINES. BYO deploy yaml works as you described — no registry lookup failure.

Signed-off-by: lishunyang <lishunyang12@163.com>
…i + it2i

Image generation is the headline modality. AR-only (i2t/t2t) and DiT-only
runs are niche; users can pass --pipeline hunyuan_image3_<variant> with a
custom deploy yaml. FP8 toggles via --quantization fp8 (DiT-only path
verified; IT2I AR + image FP8 hits an upstream vLLM kernel limitation —
see vllm-project#2976).

dit_only.yaml moved to tests/e2e/.../stage_configs/ as a CI-only fixture;
the dit_only pipeline registration is kept so users can BYO deploy.

Signed-off-by: lishunyang <lishunyang12@163.com>
…s cleanup

Signed-off-by: lishunyang <lishunyang12@163.com>
@lishunyang12 lishunyang12 force-pushed the config-refactor-4-hunyuan-image3 branch from df65e00 to dbadc5c Compare April 22, 2026 16:04
@lishunyang12
Collaborator Author

@TaffyOfficial (re: stage_configs rename) — Done. Renamed to tests/e2e/offline_inference/deploy/hunyuan_image3_dit_only_ci.yaml and updated test_hunyuanimage3_text2img.py:21 to match.

@lishunyang12
Collaborator Author

@TaffyOfficial (re: T2I path) — Good point, hadn't documented this. Updated the PR description with a "T2I path choice" subsection: hunyuan_image3_t2i stays AR→DiT (matches bot_task="text"), and users wanting Tencent's bot_task="image" semantics use --pipeline hunyuan_image3_dit_only with their own deploy yaml. Thanks for flagging.

@TaffyOfficial
Contributor

  1. The platforms.npu section in hunyuan_image3_t2i.yaml looks potentially inconsistent with the top-level pipeline choice.
    The deploy still targets pipeline: hunyuan_image3_t2i (AR → DiT), but the NPU comment says “DiT only — single-stage NPU deployment” and only overrides stage_id: 0.
    Given the current platform override behavior is stage-wise patching rather than replacing the full stage list, this seems likely to keep the other base stage(s) unless explicitly overridden.
    Could you clarify whether the comment is stale, or whether the NPU path is intended to be truly DiT-only? If it is meant to be DiT-only, it may be better to point it at hunyuan_image3_dit_only or make the intent explicit in the config layout.

  2. One coverage concern: the e2e text2img test now points to hunyuan_image3_dit_only_ci.yaml, so it exercises neither the shipped default hunyuan_image3_t2i.yaml nor the AR→DiT KV-transfer path that this PR makes the default T2I route.
    Since the refactor's main semantic choice is exactly that default path, it would be good to keep at least one test covering the shipped hunyuan_image3_t2i deploy (even if the CI-only DiT-only fixture stays for cost/runtime reasons).

Signed-off-by: lishunyang <lishunyang12@163.com>
@lishunyang12
Collaborator Author

@TaffyOfficial — both addressed in 8d0142e.

  1. NPU section on t2i.yaml was semantically wrong (you were right — platform overlays are stage-wise patches, not full replacements, so an NPU DiT-only override landing on the AR→DiT base was broken). Split fix: dropped the platforms.npu block from hunyuan_image3_t2i.yaml with a pointer comment, and shipped vllm_omni/deploy/hunyuan_image3_dit_only.yaml as the proper home for the DiT-only NPU deployment (CUDA 4x H20 default + NPU 8x A3-64G section). Registry comment updated; hunyuan_image3_dit_only now has a default deploy.

  2. Coverage gap on shipped AR→DiT t2i yaml — added TestHunyuanImage3ShippedDeploys in tests/test_config_factory.py: parametrized parse + pipeline-registry resolution + stage-topology validation across all three shipped deploys (t2i / it2i / dit_only), plus a targeted check that the t2i deploy wires stage 1 to consume stage 0 (KV-transfer path) and pins the 4 AR + 4 DiT placement. Catches schema regressions on the AR→DiT default without burning 8 GPUs. A full e2e on the AR→DiT path needs new golden CLIP embeddings — tracked as a follow-up.

@TaffyOfficial
Contributor

Thanks, this addresses the two main concerns. The NPU split looks semantically correct now, and the structural coverage for the shipped AR→DiT path is a reasonable CI-cost compromise.

Two small follow-ups:

  1. The PR body / pipeline.py docstring still seem to say dit_only has no shipped default deploy, which is now stale after adding vllm_omni/deploy/hunyuan_image3_dit_only.yaml.
  2. Since these are shipped deploys, the new tests probably should assert the files exist rather than pytest.skip if missing; otherwise deleting a shipped yaml would silently skip the coverage.

… yaml

Signed-off-by: lishunyang <lishunyang12@163.com>
@lishunyang12
Collaborator Author

@TaffyOfficial — both addressed in e6d526b.

  1. Updated the pipeline.py module docstring: t2i / it2i / dit_only now ship default deploy yamls (with NPU overlay on dit_only); only i2t / t2t are BYO. PR description already reflects this.
  2. Swapped the two pytest.skip paths in TestHunyuanImage3ShippedDeploys for assert deploy_path.exists() — deleting a shipped yaml now fails CI instead of silently skipping.
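
The skip-versus-assert distinction in item 2 can be sketched without pytest. `load_shipped_deploy` is a hypothetical helper, not the actual test code in `TestHunyuanImage3ShippedDeploys`; the temporary files stand in for the shipped yamls.

```python
import tempfile
from pathlib import Path


def load_shipped_deploy(deploy_path: Path) -> str:
    """Read a shipped deploy yaml, treating a missing file as a hard failure."""
    # A skip here would hide the deletion of a shipped file; an assert surfaces it.
    assert deploy_path.exists(), f"shipped deploy yaml missing: {deploy_path}"
    return deploy_path.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    shipped = Path(tmp) / "hunyuan_image3_t2i.yaml"
    shipped.write_text("pipeline: hunyuan_image3_t2i\n", encoding="utf-8")
    text = load_shipped_deploy(shipped)  # file present: loads normally

    missing = Path(tmp) / "hunyuan_image3_it2i.yaml"
    try:
        load_shipped_deploy(missing)
        missing_detected = False
    except AssertionError:
        missing_detected = True  # file absent: the check fails loudly
```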

@lishunyang12
Collaborator Author

@TaffyOfficial Can you help test on npu side? I will conduct e2e test again today to push towards ready state.

@TaffyOfficial
Contributor

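@TaffyOfficial Could you help test on the CPU side? I will run the end-to-end test again today to get this to a ready state as soon as possible.

I may not get to it today; tomorrow should work.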

Signed-off-by: lishunyang <lishunyang12@163.com>
@TaffyOfficial
Contributor

@lishunyang12

Test results summary

CPU-side testing of PR #2989 is complete:

✅ Passing tests

  • tests/config/test_pipeline_registry.py: 9/9 passed (10.77s)
    • All pipeline registry tests pass (lazy loading, dynamic registration, central registry)

❌ Failing tests

  • tests/test_config_factory.py: 85 passed, 5 failed (7.44s)

The 5 failures:

  1. TestSentinelDefaultPrecedence::test_none_value_skipped_yaml_wins
  2. TestHunyuanImage3ShippedDeploys::test_shipped_deploys_parse_and_resolve[hunyuan_image3_t2i.yaml-...]
  3. TestHunyuanImage3ShippedDeploys::test_shipped_deploys_parse_and_resolve[hunyuan_image3_it2i.yaml-...]
  4. TestHunyuanImage3ShippedDeploys::test_shipped_deploys_parse_and_resolve[hunyuan_image3_dit_only.yaml-...]
  5. TestHunyuanImage3ShippedDeploys::test_t2i_ar_dit_topology

Key error:
ValueError: Pipeline 'hunyuan_image3_t2i' has async_chunk=True in deploy but no stage
declares a next-stage input processor (async_chunk_process_next_stage_input_func or
custom_process_next_stage_input_func). Either set async_chunk=False or implement an
async-chunk processor on the pipeline.

Location: vllm_omni/config/stage_config.py:788

This is a real bug in the PR: the HunyuanImage3 deploy config enables async_chunk=True, but pipeline.py does not declare a corresponding async-chunk processor.

tests/config/test_pipeline_registry.py │ 9/9 passed ✅ │ 10.77s

tests/test_config_factory.py │ 85 passed, 5 failed ❌ │ 7.44s

All 5 failures share the same root cause: merge_pipeline_deploy() raises during validation at stage_config.py:788, because the HunyuanImage3 deploy config has async_chunk=True while pipeline.py declares neither async_chunk_process_next_stage_input_func nor custom_process_next_stage_input_func.
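
The failing check can be approximated as below. This is a sketch reconstructed from the error message only; the `Stage` dataclass and `validate_async_chunk` are hypothetical, and the real validation in `stage_config.py` may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Stage:
    """Hypothetical stage record carrying the two processor fields named in the error."""
    stage_id: int
    async_chunk_process_next_stage_input_func: Optional[str] = None
    custom_process_next_stage_input_func: Optional[str] = None


def validate_async_chunk(pipeline_name: str, stages: list, async_chunk: bool) -> None:
    """Reject async_chunk=True when no stage declares a next-stage input processor."""
    if not async_chunk:
        return
    has_processor = any(
        stage.async_chunk_process_next_stage_input_func
        or stage.custom_process_next_stage_input_func
        for stage in stages
    )
    if not has_processor:
        raise ValueError(
            f"Pipeline {pipeline_name!r} has async_chunk=True in deploy but no stage "
            "declares a next-stage input processor"
        )


# Two stages with no processor declared, mirroring the reported failure.
stages = [Stage(stage_id=0), Stage(stage_id=1)]
```

Under this reading there are two ways out, matching the error text: set async_chunk=False in the deploy config, or declare a processor on one of the pipeline's stages.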

Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
…ne_args

Signed-off-by: lishunyang <lishunyang12@163.com>
…ide recipe

Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
… H200

Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
Signed-off-by: lishunyang <lishunyang12@163.com>
…olocation profiling race

Signed-off-by: lishunyang <lishunyang12@163.com>
@lishunyang12
Collaborator Author

Closed; taken over by #3172.

6 participants